21 research outputs found

    Shibboleth as a Tool for Authorized Access Control to the Subversion Repository System

    Shibboleth is an architecture and protocol for allowing users to authenticate and be authorized to use a remote resource by logging into the identity management system that is maintained at their home institution. With Shibboleth, a federation of institutions can share resources among users and yet allow the administration of both the user access control to resources and the user identity and attribute information to be performed at the hosting or home institution. Subversion is a version control repository system that allows the creation of fine-grained permissions to files and directories. In this project an infrastructure, Shibbolized Subversion, has been created that consists of a Subversion repository with an Apache web interface that is protected by a Shibboleth authentication system. The infrastructure can allow authorized and authenticated data sharing between institutions yet retains simplicity and protects privacy for users. In addition, it also relieves local administrators from the task of having to perform extra account management for users from other institutions. This paper describes the Shibboleth and Subversion systems, the implementation of the file sharing infrastructure, and issues of attribute maintenance, privacy and security

    Maneuverable Applications: Advancing Distributed Computing

    Extending the military principle of maneuver into the war-fighting domain of cyberspace, academic and military researchers have produced many theoretical and strategic works, though few have focused on researching the applications and systems that apply this principle. We present a survey of our research in developing new architectures for the enhancement of parallel and distributed applications. Specifically, we discuss our work in applying the military concept of maneuver in the cyberspace domain by creating a set of applications and systems called “maneuverable applications.” Our research investigates resource provisioning, application optimization, and cybersecurity enhancement through the modification, relocation, addition or removal of computing resources. We first describe our work to create a system to provision a big data computational resource within academic environments. Secondly, we present a computing testbed built to allow researchers to study network optimizations of data centers. Thirdly, we discuss our Petri Net model of an adaptable system, which increases its cyber security posture in the face of varying levels of threat from malicious actors. Finally, we present evidence that traditional ideas about extending maneuver into cyberspace focus on security only, but computing can benefit from maneuver in multiple manners beyond security

    Teaching HDFS/MapReduce Systems Concepts to Undergraduates

    This paper presents the development of a Hadoop MapReduce module that has been taught in a course in distributed computing to upper undergraduate computer science students at Clemson University. The paper describes our teaching experiences and the feedback from the students over several semesters that have helped to shape the course. We provide suggested best practices for lecture materials, the computing platform, and the teaching methods. In addition, the computing platform and teaching methods can be extended to accommodate emerging technologies and modules for related courses

    Random Access in Nondelimited Variable-length Record Collections for Parallel Reading with Hadoop

    The industry standard Packet CAPture (PCAP) format for storing network packet traces is normally only readable in serial due to its lack of delimiters, indexing, or blocking. This presents a challenge for parallel analysis of large networks, where packet traces can be many gigabytes in size. In this work we present RAPCAP, a novel method for random access into variable-length record collections like PCAP by identifying a record boundary within a small number of bytes of the access point. Unlike related heuristic methods that can limit scalability with a nonzero probability of error, the new method offers a correctness guarantee with a well formed file and does not rely on prior knowledge of the contents. We include a practical implementation of the algorithm with an extension to the Hadoop framework, and a performance comparison to serial ingestion. Finally, we present a number of similar storage types that could utilize a modified version of RAPCAP for random access

    Synthetic Image Data for Deep Learning

    Realistic synthetic image data rendered from 3D models can be used to augment image sets and train image classification and semantic segmentation models. In this work, we explore how high quality physically-based rendering and domain randomization can efficiently create a large synthetic dataset based on production 3D CAD models of a real vehicle. We use this dataset to quantify the effectiveness of synthetic augmentation using U-net and Double-U-net models. We found that, for this domain, synthetic images were an effective technique for augmenting limited sets of real training data. We observed that models trained on purely synthetic images had a very low mean prediction IoU on real validation images. We also observed that adding even very small amounts of real images to a synthetic dataset greatly improved accuracy, and that models trained on datasets augmented with synthetic images were more accurate than those trained on real images alone. Finally, we found that in use cases that benefit from incremental training or model specialization, pretraining a base model on synthetic images provided a sizeable reduction in the training cost of transfer learning, allowing up to 90% of the model training to be front-loaded

    The Multigraph Modeling Tool

    Measuring the Effects of Thread Placement on the Kendall Square KSR1

    This paper describes a measurement study of the effects of thread placement on memory access times on the Kendall Square multiprocessor, the KSR1. The KSR1 uses a conventional shared memory programming model in a distributed memory architecture. The architecture is based on a ring of rings of 64-bit superscalar microprocessors. The KSR1 has a Cache-Only Memory Architecture (COMA). Memory consists of the local cache memories attached to each processor. Whenever an address is accessed, the data item is automatically copied to the local cache memory module, so that access times for subsequent references will be minimal. If a local cache has space allocated for a particular data item, but does not have a current valid copy of that data item, then it is possible for the cache to acquire a valid read-only copy before it is requested by the local processor due to a request by a different processor that happens to pass by on the ring. This automatic prefetching can greatly reduce the average time for a thread to acquire data items. Because of the automatic prefetching, the time required to obtain a valid copy of a data item does not depend simply on the distance from the owner of the data item, but also depends on the placement and number of other processing threads which share the same data item. Also, the strategic placement of processing threads helps programs take advantage of the unique features of the memory architecture which help eliminate memory access bottlenecks for shared data sets. Experiments run on the KSR1 across a wide variety of thread configurations show that shared memory access is accelerated through strategic placement of threads which share data. The results indicate strategies for improving the performance of applications programs, and illustrate that KSR1 memory access times can remain nearly constant even when the number of participating threads increases